AITopics | example usage


example usage

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

The recent progress of large language model agents has opened new possibilities for automating tasks through graphical user interfaces (GUIs), especially in mobile environments where intelligent interaction can greatly enhance usability. However, practical deployment of such agents remains constrained by several key challenges. Existing training data is often noisy and lacks semantic diversity, which hinders the learning of precise grounding and planning. Models trained purely by imitation tend to overfit to seen interface patterns and fail to generalize to unfamiliar scenarios. Moreover, most prior work focuses on English interfaces while overlooking the growing diversity of non-English applications such as those in the Chinese mobile ecosystem. In this work, we present AgentCPM-GUI, an 8B-parameter GUI agent built for robust and efficient on-device GUI interaction. Our training pipeline includes grounding-aware pre-training to enhance perception, supervised fine-tuning on high-quality Chinese and English trajectories to imitate human-like actions, and reinforcement fine-tuning with GRPO to improve reasoning capability. We also introduce a compact action space that reduces output length and supports low-latency execution on mobile devices. AgentCPM-GUI achieves state-of-the-art performance on five public benchmarks and a new Chinese GUI benchmark called CAGUI, reaching 96.9% Type-Match and 91.3% Exact-Match. To facilitate reproducibility and further research, we publicly release all code, model checkpoints, and evaluation data.
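The abstract reports Type-Match and Exact-Match scores without defining them. As a rough sketch only, the snippet below assumes that Type-Match checks whether the predicted action type equals the reference type and that Exact-Match additionally requires every action parameter to agree. The action dictionaries, field names, and function names are hypothetical illustrations, not the AgentCPM-GUI action space or evaluation code, and real benchmarks may apply tolerances (for example on click coordinates) that are omitted here.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical action representation: a dict with a "type" key plus parameters.
Action = Dict[str, Any]


def type_match(pred: Action, gold: Action) -> bool:
    """Assumed definition: only the action type has to agree."""
    return pred.get("type") == gold.get("type")


def exact_match(pred: Action, gold: Action) -> bool:
    """Assumed definition: the type and every parameter have to agree."""
    return pred == gold


def score(preds: List[Action], golds: List[Action]) -> Tuple[float, float]:
    """Return (type-match rate, exact-match rate) over paired steps."""
    if len(preds) != len(golds):
        raise ValueError("preds and golds must have the same length")
    n = len(golds)
    if n == 0:
        return 0.0, 0.0
    tm = sum(type_match(p, g) for p, g in zip(preds, golds)) / n
    em = sum(exact_match(p, g) for p, g in zip(preds, golds)) / n
    return tm, em
```

Under these assumptions the exact-match rate can never exceed the type-match rate, which is consistent with the ordering of the two reported figures.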

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.01391

Country: Asia (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Graphics (1.00)
Information Technology > Communications > Mobile (1.00)
(3 more...)


Smoothing Grounding and Reasoning for MLLM-Powered GUI Agents with Query-Oriented Pivot Tasks

Wu, Zongru, Cheng, Pengzhou, Wu, Zheng, Ju, Tianjie, Zhang, Zhuosheng, Liu, Gongshen

arXiv.org Artificial Intelligence, Mar-4-2025

Perception-enhanced pre-training, particularly through grounding techniques, is widely adopted to enhance the performance of graphical user interface (GUI) agents. However, in resource-constrained scenarios, the format discrepancy between coordinate-oriented grounding and action-oriented reasoning limits the effectiveness of grounding for reasoning tasks. To address this challenge, we propose a query-oriented pivot approach called query inference, which serves as a bridge between GUI grounding and reasoning. By inferring potential user queries from a screenshot and its associated element coordinates, query inference improves the understanding of coordinates while aligning more closely with reasoning tasks. Experimental results show that query inference outperforms previous grounding techniques under the same training data scale. Notably, query inference achieves performance comparable to, or even better than, the large-scale grounding-enhanced OS-Atlas with less than 0.1% of the training data. Furthermore, we explore the impact of reasoning formats and demonstrate that integrating additional semantic information into the input further boosts reasoning performance. The code is publicly available at https://github.com/ZrW00/GUIPivot.

example usage, query inference, reasoning, (13 more...)

arXiv.org Artificial Intelligence

2503.00401

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.67)

Technology:

Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)


Detection of Non-recorded Word Senses in English and Swedish

Lautenschlager, Jonathan, Sköldberg, Emma, Hengchen, Simon, Schlechtweg, Dominik

arXiv.org Artificial Intelligence, Mar-4-2024

This study addresses the task of Unknown Sense Detection in English and Swedish. The primary objective of this task is to determine whether the meaning of a particular word usage is documented in a dictionary or not. For this purpose, sense entries are compared with word usages from modern and historical corpora using a pre-trained Word-in-Context embedder that allows us to model this task in a few-shot scenario. Additionally, we use human annotations to adapt and evaluate our models. Compared to a random sample from a corpus, our model is able to considerably increase the detected number of word usages with non-recorded senses.
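One simple way to picture the comparison between sense entries and word usages is a nearest-sense similarity check. The sketch below is an illustration under stated assumptions, not the authors' model: it assumes each usage and each dictionary sense entry has already been embedded as a vector, and it flags a usage as a candidate non-recorded sense when its best cosine similarity to any sense entry falls below a threshold. The function names and the threshold value of 0.5 are hypothetical.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def is_unrecorded(
    usage_vec: Sequence[float],
    sense_vecs: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical cut-off, to be tuned on annotated data
) -> bool:
    """Flag a usage whose closest dictionary sense is still too dissimilar."""
    best = max((cosine(usage_vec, s) for s in sense_vecs), default=-1.0)
    return best < threshold
```

In practice the threshold would be tuned on annotated data, which is one place the human annotations mentioned in the abstract could be used.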

headword, usage, word usage, (15 more...)

arXiv.org Artificial Intelligence

2403.02285

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Saxony > Leipzig (0.05)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)
(13 more...)

Genre: Research Report (0.40)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


LILO: Learning Interpretable Libraries by Compressing and Documenting Code

Grand, Gabriel, Wong, Lionel, Bowers, Matthew, Olausson, Theo X., Liu, Muxin, Tenenbaum, Joshua B., Andreas, Jacob

arXiv.org Artificial Intelligence, Oct-30-2023

While large language models (LLMs) now excel at code generation, a key aspect of software development is the art of refactoring: consolidating code into libraries of reusable and readable programs. In this paper, we introduce LILO, a neurosymbolic framework that iteratively synthesizes, compresses, and documents code to build libraries tailored to particular problem domains. LILO combines LLM-guided program synthesis with recent algorithmic advances in automated refactoring from Stitch, a symbolic compression system that efficiently identifies optimal lambda abstractions across large code corpora. To make these abstractions interpretable, we introduce an auto-documentation (AutoDoc) procedure that infers natural language names and docstrings based on contextual examples of usage. In addition to improving human readability, we find that AutoDoc boosts performance by helping LILO's synthesizer to interpret and deploy learned abstractions. We evaluate LILO on three inductive program synthesis benchmarks for string editing, scene reasoning, and graphics composition. Compared to existing neural and symbolic methods, including the state-of-the-art library learning algorithm DreamCoder, LILO solves more complex tasks and learns richer libraries that are grounded in linguistic knowledge.

abstraction, library, preprint, (16 more...)

arXiv.org Artificial Intelligence

2310.19791

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Quebec > Montreal (0.04)
(19 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Education > Educational Setting > Continuing Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


ChengBinJin/MRI-to-CT-DCNN-TensorFlow

#artificialintelligence, Sep-20-2019, 08:37:57 GMT

This repository is an implementation of "MR‐based synthetic CT generation using a deep convolutional neural network method." The toy dataset includes only 367 paired images. We randomly divide the data into training, validation, and test sets. Use main.py both to train and to test the DCNN model.
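The random division into training, validation, and test sets could look roughly like the sketch below. The summary does not state the split ratios or how the repository implements the split, so the 80/10/10 proportions, the fixed seed, and the function name `split_indices` are assumptions made purely for illustration.

```python
import random
from typing import List, Tuple


def split_indices(
    n: int,
    val_frac: float = 0.1,   # assumed fraction, not taken from the repository
    test_frac: float = 0.1,  # assumed fraction, not taken from the repository
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices 0..n-1 and cut them into train/val/test lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded so the split is reproducible
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    val = idx[:n_val]
    test = idx[n_val:n_val + n_test]
    train = idx[n_val + n_test:]
    return train, val, test


# 367 is the number of paired images mentioned in the summary.
train, val, test = split_indices(367)
```

Splitting index lists rather than the images themselves keeps each MR/CT pair together, since a single index refers to both halves of the pair.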

chengbinjin mri-to-ct-dcnn-tensorflow, dcnn model, example usage, (2 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.64)


deepmind/deepmind-research

#artificialintelligence, Sep-17-2019, 20:28:46 GMT

This repository contains the trained model and dataset used for Unsupervised Adversarial Training (UAT) from the paper "Are Labels Required for Improving Adversarial Robustness?" Our model is available via TF-Hub. For example usage, refer to quick_eval_cifar.py. The preferred method of running this script is through run.sh, which will set up a virtual environment, install the dependencies, and run the evaluation script, which will print the adversarial accuracy of the model. Note that this file is very large and requires 227 GB of disk space.

large language model, machine learning, natural language, (5 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.76)


BenWhetton/keras-surgeon

@machinelearnbot, Aug-24-2017, 17:55:53 GMT

Keras-surgeon provides simple methods for modifying trained Keras models. Keras-surgeon is compatible with any model architecture. Any number of layers can be modified in a single traversal of the network. These kinds of modifications are sometimes known as network surgery, which inspired the name of this package. The operations module contains simple methods to perform network surgery on a single layer within a model.

artificial intelligence, benwhetton keras-surgeon, social media, (11 more...)

@machinelearnbot

Technology:

Information Technology > Artificial Intelligence (0.53)
Information Technology > Communications > Social Media (0.51)
